
Expose learning rate parameter #9

Open · wants to merge 2 commits into master
Conversation

fcdimitr (Owner) commented on Jun 9, 2021

This pull request resolves the following comment from @parashardhapola:

> Hi,
>
> I found that the learning rate parameter (eta) used in the KL-minimization step (gradient_descend.cpp) is not exposed and is hard-coded to 200. Is it possible to expose it in sgtsne (sgtsne.cpp) and also in demo_stochastic_matrix.cpp?
>
> The learning rate is a crucial parameter for obtaining a high-quality embedding, and it would be nice to give the user the ability to increase its value for larger datasets. The paper below has a nice discussion about this:
> https://www.nature.com/articles/s41467-019-13056-x
>
> Thanks
>
> Best regards,
> Parashar
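
For illustration, a minimal sketch of what exposing the parameter could look like; the names `GdParams` and `gradient_step` are hypothetical, not the actual sgtsne.cpp API:

```cpp
// Illustrative sketch only: the real sgtsne.cpp / gradient_descend.cpp
// interfaces may differ. The point is to thread a caller-supplied eta
// through to the update loop instead of hard-coding 200.
#include <cstddef>
#include <vector>

struct GdParams {
  int    max_iter = 1000;   // gradient-descent iterations
  double eta      = 200.0;  // learning rate; default keeps old behavior
};

// One KL-minimization update step, scaled by the user-supplied eta.
void gradient_step(std::vector<double>& y,
                   const std::vector<double>& grad,
                   const GdParams& p) {
  for (std::size_t i = 0; i < y.size(); ++i)
    y[i] -= p.eta * grad[i];
}
```

With `eta` defaulting to 200.0, existing callers would keep the current behavior, while users with larger datasets could pass a higher value.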

dkobak commented on May 6, 2021

Related to this, I am wondering whether your implementation uses the factor of 4 in the gradient? See Eq. 5 in https://github.com/fcdimitr/sgtsnepi#approximation-of-the-gradient. Many implementations, following van der Maaten's original implementation, leave the factor of 4 out, but some do not. This has no influence on anything apart from scaling the learning rate by the same factor of 4, so it would be good to know whether your code includes the factor of 4 or not.
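
For reference, the exact gradient of the KL cost in the original t-SNE paper (van der Maaten & Hinton, 2008) carries the leading factor of 4; whether Eq. 5 in the README linked above keeps it is exactly the question here:

```latex
% Exact t-SNE gradient with the leading 4 (van der Maaten & Hinton, 2008).
% Dropping the 4 only rescales the effective learning rate.
\frac{\partial C}{\partial \mathbf{y}_i}
  = 4 \sum_{j \neq i} (p_{ij} - q_{ij})\, q_{ij} Z \,(\mathbf{y}_i - \mathbf{y}_j),
\qquad
Z = \sum_{k \neq l} \bigl(1 + \lVert \mathbf{y}_k - \mathbf{y}_l \rVert^2\bigr)^{-1}.
```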

fcdimitr (Owner) commented on Jun 9, 2021

@parashardhapola 👋 please test the branch and let me know if it resolves your issues.

> Related to this, I am wondering whether your implementation uses the factor of 4 in the gradient? […]

@dkobak SG-t-SNE-Π does not have the factor of 4 in the gradient.
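
A one-line consequence, assuming plain gradient descent (y ← y − η·∇C): to reproduce an embedding from an implementation that keeps the 4, the learning rate here would need to be four times larger:

```latex
% Matching step sizes across the two conventions:
\eta_{\text{no }4} \,\nabla C \;=\; \eta_{\text{with }4} \cdot 4\,\nabla C
\quad\Longrightarrow\quad
\eta_{\text{no }4} \;=\; 4\,\eta_{\text{with }4}.
```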

fcdimitr requested a review from ailiop on Jun 9, 2021 at 11:08